{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 深度学习用于计算机视觉\n",
    "### 卷积神经网络\n",
    "使用卷积神经网络对 MNIST 进行分类"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Using TensorFlow backend.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:63: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.\n",
      "\n",
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:488: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.\n",
      "\n",
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:3626: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.\n",
      "\n",
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:3454: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.\n",
      "\n"
     ]
    }
   ],
   "source": [
    "from keras import layers\n",
    "from keras import models\n",
    "\n",
    "model = models.Sequential()\n",
    "model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28,28,1)))\n",
    "model.add(layers.MaxPooling2D((2,2)))\n",
    "model.add(layers.Conv2D(64, (3, 3), activation='relu'))\n",
    "model.add(layers.MaxPooling2D((2, 2)))\n",
    "model.add(layers.Conv2D(64, (3, 3), activation='relu'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " * input_shape = (28,28,1)->MNIST输入的图像的尺寸\n",
    " * input_shape = (image_height, image_width, image_channels)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "_________________________________________________________________\n",
      "Layer (type)                 Output Shape              Param #   \n",
      "=================================================================\n",
      "conv2d_1 (Conv2D)            (None, 26, 26, 32)        320       \n",
      "_________________________________________________________________\n",
      "max_pooling2d_1 (MaxPooling2 (None, 13, 13, 32)        0         \n",
      "_________________________________________________________________\n",
      "conv2d_2 (Conv2D)            (None, 11, 11, 64)        18496     \n",
      "_________________________________________________________________\n",
      "max_pooling2d_2 (MaxPooling2 (None, 5, 5, 64)          0         \n",
      "_________________________________________________________________\n",
      "conv2d_3 (Conv2D)            (None, 3, 3, 64)          36928     \n",
      "=================================================================\n",
      "Total params: 55,744\n",
      "Trainable params: 55,744\n",
      "Non-trainable params: 0\n",
      "_________________________________________________________________\n"
     ]
    }
   ],
   "source": [
    "model.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "可以看出：随着网络深入，width,height逐渐变小，通道数量由 layer 的第一个参数决定的。\n",
    "### 接下来\n",
    "把(3, 3, 64)张量输出到密集连接分类器网络中：首先扁平化到1D，然后就是Dense层的操作了~"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:1255: calling reduce_prod_v1 (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.\n",
      "Instructions for updating:\n",
      "keep_dims is deprecated, use keepdims instead\n"
     ]
    }
   ],
   "source": [
    "model.add(layers.Flatten())\n",
    "model.add(layers.Dense(64, activation='relu'))\n",
    "model.add(layers.Dense(10, activation='softmax'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "_________________________________________________________________\n",
      "Layer (type)                 Output Shape              Param #   \n",
      "=================================================================\n",
      "conv2d_1 (Conv2D)            (None, 26, 26, 32)        320       \n",
      "_________________________________________________________________\n",
      "max_pooling2d_1 (MaxPooling2 (None, 13, 13, 32)        0         \n",
      "_________________________________________________________________\n",
      "conv2d_2 (Conv2D)            (None, 11, 11, 64)        18496     \n",
      "_________________________________________________________________\n",
      "max_pooling2d_2 (MaxPooling2 (None, 5, 5, 64)          0         \n",
      "_________________________________________________________________\n",
      "conv2d_3 (Conv2D)            (None, 3, 3, 64)          36928     \n",
      "_________________________________________________________________\n",
      "flatten_1 (Flatten)          (None, 576)               0         \n",
      "_________________________________________________________________\n",
      "dense_1 (Dense)              (None, 64)                36928     \n",
      "_________________________________________________________________\n",
      "dense_2 (Dense)              (None, 10)                650       \n",
      "=================================================================\n",
      "Total params: 93,322\n",
      "Trainable params: 93,322\n",
      "Non-trainable params: 0\n",
      "_________________________________________________________________\n"
     ]
    }
   ],
   "source": [
    "model.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Flatten 层将3*3*64的3D张量转换成了576的1D张量"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/optimizers.py:711: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.\n",
      "\n",
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:2857: calling reduce_sum_v1 (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.\n",
      "Instructions for updating:\n",
      "keep_dims is deprecated, use keepdims instead\n",
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:2861: The name tf.log is deprecated. Please use tf.math.log instead.\n",
      "\n",
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow_core/python/ops/math_grad.py:1424: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.\n",
      "Instructions for updating:\n",
      "Use tf.where in 2.0, which has the same broadcast rule as np.where\n",
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:671: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.\n",
      "Instructions for updating:\n",
      "Call initializer instance with the dtype argument instead of passing it to the constructor\n",
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:949: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.\n",
      "\n",
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:936: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.\n",
      "\n",
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:2353: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.\n",
      "\n",
      "Epoch 1/5\n",
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:158: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.\n",
      "\n",
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:163: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.\n",
      "\n",
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:172: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.\n",
      "\n",
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:180: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.\n",
      "\n",
      "WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:187: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.\n",
      "\n",
      "60000/60000 [==============================] - 21s 348us/step - loss: 0.1801 - acc: 0.9436\n",
      "Epoch 2/5\n",
      "60000/60000 [==============================] - 21s 345us/step - loss: 0.0497 - acc: 0.9844\n",
      "Epoch 3/5\n",
      "60000/60000 [==============================] - 23s 382us/step - loss: 0.0344 - acc: 0.9891\n",
      "Epoch 4/5\n",
      "60000/60000 [==============================] - 22s 359us/step - loss: 0.0274 - acc: 0.9915\n",
      "Epoch 5/5\n",
      "60000/60000 [==============================] - 21s 353us/step - loss: 0.0210 - acc: 0.9935\n",
      "10000/10000 [==============================] - 1s 79us/step\n"
     ]
    }
   ],
   "source": [
    "from keras.datasets import mnist\n",
    "from keras.utils import to_categorical\n",
    "\n",
    "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n",
    "\n",
    "train_images = train_images.reshape((60000, 28, 28, 1))\n",
    "train_images = train_images.astype('float32') / 255\n",
    "\n",
    "test_images = test_images.reshape((10000, 28, 28, 1))\n",
    "test_images = test_images.astype('float32') / 255\n",
    "\n",
    "train_labels = to_categorical(train_labels)\n",
    "test_labels = to_categorical(test_labels)\n",
    "\n",
    "model.compile(optimizer='rmsprop',\n",
    "             loss='categorical_crossentropy',\n",
    "             metrics=['accuracy'])\n",
    "model.fit(train_images, train_labels, epochs=5, batch_size=64)\n",
    "\n",
    "test_loss, test_acc = model.evaluate(test_images, test_labels)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.99"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test_acc"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**可以看到，精确度提高到了99%！**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 卷积神经网络学习到了：\n",
    " * 平移不变性（translation invariant）:在一个角落学习到了一个特征，可以应用到任何区域。\n",
    " * 模式的层次空间结构（spatial hierarchies of patterns）:可以逐层学习到更抽象、更复杂的视觉概念。\n",
    " \n",
    "输出的深度轴代表了各种不同的过滤器。能够得到不同的特征。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 为什么需要下采样？\n",
    "1. 减少需要学习的参数数量\n",
    "2. 可以让卷积层的观察窗口越来越大（每个元素对应原始图形中的像素数量）\n",
    "\n",
    "下采样的方法：\n",
    " * 增长步幅——取样减少重合像素\n",
    " * 最大池化——四合一取最大值\n",
    " * 平均池化——四合一取平均值\n",
    "总体上说，最大池化比平均池化更好，特征更明显"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
